Building basic vocabulary across 40 languages
Abstract
The paper explores options for building bilingual dictionaries by automated methods. We define the notion of ‘basic vocabulary’ and investigate how well the conceptual units that make up this language-independent vocabulary are covered by language-specific bindings in 40 languages.
Similar resources
Cultural Phylogenetics of the Tupi Language Family in Lowland South America
BACKGROUND: Recent advances in automated assessment of basic vocabulary lists allow the construction of linguistic phylogenies useful for tracing dynamics of human population expansions, reconstructing ancestral cultures, and modeling transition rates of cultural traits over time. METHODS: Here we investigate the Tupi expansion, a widely-dispersed language family in lowland South America, with ...
Dynamics of Vocabulary Evolution
In this paper we study the evolution of vocabulary from the dynamic system point of view. Based on some basic assumptions of the relationship between the brain dynamics and the pronunciation of vocabulary, we model the beginning of vocabulary evolution as an interacting, classifying and clustering process of brain’s dynamic representations of speakable and unspeakable meanings. As illustrative ...
Lemmatized Latent Semantic Model for Language Model Adaptation of Highly Inflected Languages
We present a method to adapt statistical N-gram models for large vocabulary continuous speech recognition of highly inflected languages. The method combines morphological analysis, latent semantic analysis (LSA) and fast marginal adaptation for building topic-adapted trigram models, based on a background language model and very short adaptation texts. We compare words, lemmas and morphemes as b...
Constraint-Based Models of Lexical Borrowing
Linguistic borrowing is the phenomenon of transferring linguistic constructions (lexical, phonological, morphological, and syntactic) from a “donor” language to a “recipient” language as a result of contacts between communities speaking different languages. Borrowed words are found in all languages, and—in contrast to cognate relationships—borrowing relationships may exist across unrelated lang...
Taste in Two Tongues: A Southeast Asian Study of Semantic Convergence
This article examines vocabulary for taste and flavor in two neighboring but unrelated languages (Lao and Kri) spoken in Laos, Southeast Asia. There are very close similarities in underlying semantic distinctions made in the taste/flavor domain in these two languages, not just in the set of basic tastes distinguished (sweet, salty, bitter, sour, umami or glutamate), but in a series of further b...